Online Knowledge Distillation via Multi-branch Diversity Enhancement
Authors
Abstract
Knowledge distillation is an effective method to transfer the knowledge from a cumbersome teacher model to a lightweight student model. Online knowledge distillation uses the ensembled prediction results of multiple student models as soft targets to train each student model. However, the homogenization problem will lead to difficulty in further improving model performance. In this work, we propose a new distillation method to enhance the diversity among multiple student models. We introduce a Feature Fusion Module (FFM), which improves the performance of the attention mechanism in the network by integrating the rich semantic information contained in the last block of the student models. Furthermore, we use a Classifier Diversification (CD) loss function to strengthen the differences between the student models and deliver a better ensemble result. Extensive experiments proved that our method significantly enhances the diversity among student models and brings better distillation performance. We evaluate our method on three image classification datasets: CIFAR-10/100 and CINIC-10. The results show that our method achieves state-of-the-art performance on these datasets.
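The abstract does not spell out the exact FFM or CD formulations, so the following is only a minimal sketch of the general online-distillation step it describes: several student branches are trained on hard labels, on the ensembled soft targets of all branches, and with a diversity-promoting penalty. The Branch module, the cosine-similarity penalty standing in for the CD loss, the temperature T, and the 0.1 weight are illustrative assumptions rather than the authors' implementation, and the FFM attention fusion is omitted entirely.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes only.
NUM_BRANCHES, NUM_CLASSES, FEAT_DIM, T = 3, 10, 64, 3.0

class Branch(nn.Module):
    """Stand-in for one student branch; a real setup would share early
    convolutional layers and diverge only in the last block."""
    def __init__(self):
        super().__init__()
        self.feat = nn.Sequential(nn.Linear(32 * 32 * 3, FEAT_DIM), nn.ReLU())
        self.classifier = nn.Linear(FEAT_DIM, NUM_CLASSES)

    def forward(self, x):
        return self.classifier(self.feat(x.flatten(1)))

branches = nn.ModuleList([Branch() for _ in range(NUM_BRANCHES)])

def online_kd_step(x, y):
    logits = [b(x) for b in branches]                     # per-branch logits
    ce = sum(F.cross_entropy(l, y) for l in logits)       # hard-label loss

    # Ensemble of softened branch predictions acts as the soft target.
    probs = [F.softmax(l / T, dim=1) for l in logits]
    ensemble = torch.stack(probs).mean(0).detach()
    kd = sum(F.kl_div(F.log_softmax(l / T, dim=1), ensemble,
                      reduction="batchmean") for l in logits) * T * T

    # Hypothetical diversification term: discourage branches from
    # producing identical predictions (a stand-in for the CD loss).
    div = 0.0
    for i in range(NUM_BRANCHES):
        for j in range(i + 1, NUM_BRANCHES):
            div = div + F.cosine_similarity(probs[i], probs[j], dim=1).mean()

    return ce + kd + 0.1 * div

x = torch.randn(8, 3, 32, 32)             # fake CIFAR-like batch
y = torch.randint(0, NUM_CLASSES, (8,))
loss = online_kd_step(x, y)
loss.backward()
```

An actual CD loss may well be defined on classifier weights or intermediate features rather than output probabilities; the cosine term above only conveys the general idea of pushing branches apart while the ensemble term pulls each branch toward a common soft target.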
Similar Resources
Branch Elimination via Multi-variable Condition Merging
Conditional branches are expensive. Branches require a significant percentage of execution cycles since they occur frequently and cause pipeline flushes when mispredicted. In addition, branches result in forks in the control flow, which can prevent other code-improving transformations from being applied. In this paper we describe profile-based techniques for replacing the execution of a set of two or...
Understanding Musical Diversity via Online Social Media
Musicologists and sociologists have long been interested in patterns of music consumption and their relation to socioeconomic status. In particular, the Omnivore Thesis examines the relationship between these variables and the diversity of music a person consumes. Using data from social media users of Last.fm and Twitter, we design and evaluate a measure that reasonably captures diversity of mu...
Sequence-Level Knowledge Distillation
Neural machine translation (NMT) offers a novel alternative formulation of translation that is potentially simpler than statistical approaches. However to reach competitive performance, NMT models need to be exceedingly large. In this paper we consider applying knowledge distillation approaches (Bucila et al., 2006; Hinton et al., 2015) that have proven successful for reducing the size of neura...
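For context, sequence-level distillation (in contrast to the word-level variant of Hinton et al., 2015) is usually described as training the student on translations generated by the teacher itself. The toy sketch below uses a greedy decode in place of beam search and tiny GRU models, so every size, token id, and module name is an assumption made for illustration, not this paper's actual recipe.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy sizes and token ids; all of these are illustrative assumptions.
VOCAB, MAX_LEN, BOS = 100, 12, 1

class TinySeq2Seq(nn.Module):
    """Minimal GRU encoder-decoder, a stand-in for a real NMT model."""
    def __init__(self, hidden):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, hidden)
        self.enc = nn.GRU(hidden, hidden, batch_first=True)
        self.dec = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, VOCAB)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))
        dec_out, _ = self.dec(self.emb(tgt_in), h)
        return self.out(dec_out)

    @torch.no_grad()
    def greedy_decode(self, src):
        # Greedy stand-in for the beam search used in practice.
        _, h = self.enc(self.emb(src))
        tok = torch.full((src.size(0), 1), BOS, dtype=torch.long)
        steps = []
        for _ in range(MAX_LEN):
            dec_out, h = self.dec(self.emb(tok), h)
            tok = self.out(dec_out).argmax(-1)
            steps.append(tok)
        return torch.cat(steps, dim=1)

teacher, student = TinySeq2Seq(64), TinySeq2Seq(32)

src = torch.randint(3, VOCAB, (4, 10))        # fake source batch
pseudo_tgt = teacher.greedy_decode(src)        # teacher's own translations
tgt_in = torch.cat([torch.full((4, 1), BOS, dtype=torch.long),
                    pseudo_tgt[:, :-1]], dim=1)

# Sequence-level KD: ordinary cross-entropy, but on teacher-generated targets.
logits = student(src, tgt_in)
loss = F.cross_entropy(logits.reshape(-1, VOCAB), pseudo_tgt.reshape(-1))
loss.backward()
```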
Topic Distillation with Knowledge Agents
This is the second year that our group participates in TREC’s Web track. Our experiments focused on the Topic distillation task. Our main goal was to experiment with the Knowledge Agent (KA) technology [1], previously developed at our Lab, for this particular task. The knowledge agent approach was designed to enhance Web search results by utilizing domain knowledge. We first describe the generi...
3D mesh segmentation via multi-branch 1D convolutional neural networks
There is an increasing interest in applying deep learning to 3D mesh segmentation. We observe that 1) existing feature-based techniques are often slow or sensitive to feature resizing, 2) there are minimal comparative studies and 3) techniques often suffer from reproducibility issue. This study contributes in two ways. First, we propose a novel convolutional neural network (CNN) for mesh segmen...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-69538-5_20